Automated Paraphrase Quality Assessment Using Language Models and Transfer Learning
Authors
Abstract
Learning to paraphrase supports both writing ability and reading comprehension, particularly for less skilled learners. As such, educational tools that integrate automated evaluations of paraphrases can be used to provide timely feedback to enhance learner paraphrasing skills more efficiently and effectively. Paraphrase identification is a popular NLP classification task that involves establishing whether two sentences share a similar meaning. Paraphrase quality assessment is a slightly more complex task, in which pairs of sentences are evaluated in-depth across multiple dimensions. In this study, we focus on four dimensions: lexical, syntactical, semantic, and overall quality. Our study introduces and evaluates various machine learning models using handcrafted features combined with Extra Trees, Siamese neural networks using BiLSTM RNNs, and pretrained BERT-based models, together with transfer learning from a larger general paraphrase corpus, to estimate the quality of paraphrases across the four dimensions. Two datasets are considered for the tasks involving paraphrase quality: ULPC (User Language Paraphrase Corpus), containing 1998 paraphrases, and a smaller dataset with 115 paraphrases based on children’s inputs. The paraphrase identification dataset used for the transfer learning task is MSRP (Microsoft Research Paraphrase Corpus), containing 5801 paraphrases. On the ULPC dataset, our BERT model improves upon the previous baseline by at least 0.1 in F1-score across the four dimensions. When fine-tuning on the children’s dataset, both the BERT and Siamese neural network models improve upon their original scores by at least 0.11 in F1-score. The results of these experiments suggest that transfer learning using generic paraphrase identification datasets can be successful, while at the same time obtaining comparable results in fewer epochs.
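To make two of the abstract's terms concrete, here is a minimal Python sketch of what a handcrafted lexical feature and an F1-score computation can look like. Everything in it is an illustrative assumption: the feature choices (Jaccard word overlap and a length ratio), the tokenizer, and the function names `tokens`, `lexical_features`, and `f1_score` are hypothetical and are not taken from the paper, which may use different features and a different F1 averaging scheme.

```python
import re


def tokens(sentence):
    """Lowercase a sentence and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def lexical_features(source, paraphrase):
    """Return two simple, hypothetical handcrafted features for a sentence pair."""
    src_tokens, par_tokens = tokens(source), tokens(paraphrase)
    src_set, par_set = set(src_tokens), set(par_tokens)
    union = src_set | par_set
    # Jaccard overlap: shared vocabulary divided by total vocabulary.
    jaccard = len(src_set & par_set) / len(union) if union else 0.0
    # Length ratio: shorter token count divided by longer token count.
    longer = max(len(src_tokens), len(par_tokens))
    shorter = min(len(src_tokens), len(par_tokens))
    length_ratio = shorter / longer if longer else 0.0
    return {"jaccard": jaccard, "length_ratio": length_ratio}


def f1_score(gold, predicted, positive=1):
    """Binary F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Features of this kind could be fed to a tree ensemble such as Extra Trees, and the resulting predictions scored per quality dimension with F1. The sketch stops short of that step to stay dependency-free.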
Similar resources
The Relationship between Using Language Learning Strategies, Learners’ Optimism, Educational Status, Duration of Learning and Demotivation
With the growth of more humanistic approaches towards teaching foreign languages, more emphasis has been put on learners’ feelings, emotions and individual differences. One of the issues in teaching and learning English as a foreign language is demotivation. The purpose of this study was to investigate the relationship between the components of language learning strategies, optimism, duration o...
Neural Paraphrase Generation using Transfer Learning
Progress in statistical paraphrase generation has been hindered for a long time by the lack of large monolingual parallel corpora. In this paper, we adapt the neural machine translation approach to paraphrase generation and perform transfer learning from the closely related task of entailment generation. We evaluate the model on the Microsoft Research Paraphrase (MSRP) corpus and show that the ...
Mining Models for Automated Quality Assessment of Learning Objects
The present paper presents the results of an alternative approach for automatically evaluating quality inside learning object repositories that considers lower-level measures of the resources as possible indicators of quality. It is known that current repositories face a difficult situation, as their amount of resources tends to increase more rapidly than the number of evaluations provided by t...
Learning Paraphrase Models from Google News Headlines
Data sources like the clusters of news headlines at Google News present an exciting opportunity to learn paraphrase models from data automatically. We present both a novel dataset and a novel approach to automatic, unsupervised learning of paraphrase models from that dataset. Leveraging existing NLP tools such as the Stanford Parser and lexical resources such as WordNet and Infomap, we construct...
Facilitating Translation Using Source Language Paraphrase Lattices
For resource-limited language pairs, coverage of the test set by the parallel corpus is an important factor that affects translation quality in two respects: 1) out of vocabulary words; 2) the same information in an input sentence can be expressed in different ways, while current phrase-based SMT systems cannot automatically select an alternative way to transfer the same information. Therefore,...
Journal
Journal title: Computers
Year: 2021
ISSN: 2073-431X
DOI: https://doi.org/10.3390/computers10120166